Abstract

This article derives the asymptotics of the distribution and the moments of
the size $X_n$ of the minimal clade of a randomly chosen individual in a
Bolthausen-Sznitman $n$-coalescent for $n \to \infty$. The Bolthausen-Sznitman
$n$-coalescent is a Markov process taking states in the set of partitions
of $\{1, \ldots, n\}$, where $1, \ldots, n$ are referred to as individuals. The minimal
clade of an individual is the equivalence class the individual is in at the
time of the first coalescence event this individual participates in.
The main tool used is the connection of the Bolthausen-Sznitman $n$-coalescent
with random recursive trees introduced by Goldschmidt and
Martin (see [16]). This connection shows that $X_n - 1$ is distributed as the
number $M_n$ of all individuals not in the equivalence class of individual 1
shortly before the time of the last coalescence event. Both functionals are
distributed like the size $RT_{n-1}$ of a uniformly chosen table in a standard
Chinese restaurant process with $n - 1$ customers. We give exact formulae
for these distributions.

Using the asymptotics of $M_n$ shown by Goldschmidt and Martin in [16],
we see that $(\log n)^{-1} \log X_n$ converges in distribution to the uniform
distribution on $[0,1]$ for $n \to \infty$.

We provide the complementary information that
$\frac{\log n}{n^k} E(X_n^k) \to \frac{1}{k}$ for
$n \to \infty$, which is also true for $M_n$ and $RT_n$.

Keywords: minimal clade size, Bolthausen-Sznitman n-coalescent, Chinese 
restaurant process 
1 Introduction

The Bolthausen-Sznitman $n$-coalescent is a time-homogeneous Markov process
$(\Pi_t^{(n)})_{t \geq 0}$ whose state space is the set of partitions of $\{1, \ldots, n\}$.
The only possible transitions in this process are those in which several blocks
of a partition are merged (or coalesced) into one new block. Only one new block
can be formed in a transition (no simultaneous mergers). Each $k$-tuple of $b$
present blocks is merged into a new block at rate

\[ \frac{(k-2)!\,(b-k)!}{(b-1)!}. \]

The Bolthausen-Sznitman $n$-coalescent is a member of the family of
$\Lambda$-$n$-coalescents (which were introduced independently by Sagitov [23] and
Pitman [22]). A $\Lambda$-$n$-coalescent is again a time-homogeneous,
continuous-time Markov process whose state space is the set of partitions of
$\{1, \ldots, n\}$. The possible transitions are again mergers of multiple blocks into a
new one. Each merger of $k$ blocks among $b$ happens with rate

\[ \int_{[0,1]} x^{k-2} (1-x)^{b-k} \, \Lambda(dx) \]

for a finite measure $\Lambda$ on $[0,1]$. Note that the Bolthausen-Sznitman coalescent
has $\Lambda = U_{[0,1]}$, the uniform distribution on $[0,1]$. Each $\Lambda$-$n$-coalescent can be
represented as a random tree with $n$ leaves $\{1, \ldots, n\}$ and random branch lengths
by representing each merger as an internal node in the tree (the branch lengths
are then the waiting times for the mergers; time is measured starting from the
leaves). Also note that a $\Lambda$-$n$-coalescent at time $t$ forms a random exchangeable
partition of $\{1, \ldots, n\}$.
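
The two rate formulas above can be cross-checked numerically. The following is a minimal sketch, not part of the original argument; it assumes the exponent of $(1-x)$ in the $\Lambda$-coalescent rate is $b-k$, with $b$ the current number of blocks, and the function names are hypothetical. It compares the closed-form Bolthausen-Sznitman rate with the integral for uniform $\Lambda$, evaluated exactly by binomial expansion.

```python
from fractions import Fraction
from math import comb, factorial


def bs_rate(b, k):
    """Closed-form rate (k-2)! (b-k)! / (b-1)! for one fixed k-tuple of b blocks."""
    if not 2 <= k <= b:
        raise ValueError("need 2 <= k <= b")
    return Fraction(factorial(k - 2) * factorial(b - k), factorial(b - 1))


def uniform_lambda_rate(b, k):
    """Integral of x^(k-2) (1-x)^(b-k) over [0,1], i.e. Lambda uniform on [0,1].

    (1-x)^(b-k) is expanded binomially and integrated term by term.
    """
    return sum(
        Fraction((-1) ** i * comb(b - k, i), k - 1 + i) for i in range(b - k + 1)
    )


def total_rate(b):
    """Total rate of leaving a state with b blocks (sum over all k-tuples)."""
    return sum(comb(b, k) * bs_rate(b, k) for k in range(2, b + 1))
```

In this sketch the total jump rate out of a state with $b$ blocks works out to $b-1$, since $\binom{b}{k}\frac{(k-2)!(b-k)!}{(b-1)!} = \frac{b}{k(k-1)}$ and the sum over $k$ telescopes.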

The Bolthausen-Sznitman $n$-coalescent was introduced by Bolthausen and Sznitman
in 1998 (see [I]). It has connections to population genetics and physics. In
mathematical physics, it appears in the context of spin glasses (see [4] and [5]).
It also seems to be a suitable model for the genealogy of a sample of $n$
alleles/genes/haplotypes in several models for selection in population genetics (see
jS], [9], [2], [19], [12]; see also the survey [7]). Note that this is in contrast to
the standard model for a genealogical tree of such a sample, which is Kingman's
$n$-coalescent ($\Lambda = \delta_0$, only two blocks merge at a time, introduced in [H]). Also note
that due to the interpretation of the Bolthausen-Sznitman $n$-coalescent as a
genealogical tree, we refer to $\{1, \ldots, n\}$ as individuals.

Here, we focus on the Bolthausen-Sznitman $n$-coalescent as a model for a
genealogical tree which depicts the ancestry of $n$ alleles sampled at a genetic locus.
Since the genealogical tree is often endowed with a mutation structure which is
interpreted under the infinitely-many-sites model, we assume a locus consisting
of many nucleotide sites, for example a gene. Different alleles can thus also be
seen as different haplotypes at the corresponding sites. One important piece of information
coded in the genealogy is the relatedness of an allele randomly chosen from the
sample to the rest of the sample. There are two functionals/statistics of the
genealogical tree which carry complementary information about this relatedness.
The first functional is the length $E_n$ of an external branch chosen at
random from the $n$ external branches associated with the leaves $\{1, \ldots, n\}$ of
the tree, introduced by Fu and Li in [15]. $E_n$ gives the time that the chosen allele
has to evolve independently of the rest of the sample (e.g., by mutation). This
gives a measure of the genetic uniqueness of this allele relative to the rest of the
sample. The second functional is the size $X_n$ of the minimal clade containing the
randomly chosen allele, introduced by Blum and François in [BJ. The minimal
clade can be defined in different, yet equivalent ways: The minimal clade is

• the equivalence class that contains the (randomly chosen) allele
$i \in \{1, \ldots, n\}$ at the first time $i$ was merged,

• all leaves of the subtree rooted at the most recent ancestor of allele $i$,

• all descendants of the most recent ancestor of allele $i$.

The minimal clade can also be seen as the smallest family containing $i$. The size
of the minimal clade gives the complementary information of how many individuals
share the genealogy with allele $i$ "after" time $E_n$ (note that since we measure
time from leaves to root, "after" $E_n$ actually means further back in time).
The external branch length has already been analyzed well for several $\Lambda$-$n$-coalescents
in the literature. Its distribution follows a recursion and its asymptotics for
sample size $n \to \infty$ are known for various $\Lambda$-$n$-coalescents (see [2], [TU], [5],
[17], [13]). For the minimal clade size, though, only results for Kingman's
$n$-coalescent ($\Lambda = \delta_0$, only two blocks merge at a time) are known (including asymptotics
for $n \to \infty$, see [5]).

The purpose of this paper is to analyze the distribution of the minimal clade size
$X_n$ in the case of the Bolthausen-Sznitman $n$-coalescent and its asymptotics for
sample size $n \to \infty$. We will exploit the construction of the Bolthausen-Sznitman
$n$-coalescent using a random recursive tree introduced by Goldschmidt and Martin
(see [16]) to prove our results. First, we observe that this construction yields
that the process describing the set of relatives of a randomly chosen individual
in the Bolthausen-Sznitman $n$-coalescent process (which is its equivalence
class without the individual itself) is equal in law to the time-reversed process
describing the set of non-relatives of the chosen individual (all individuals in
different equivalence classes than the chosen individual). This shows that the
minimal clade size actually is distributed as the sum $M_n$ of the sizes of all blocks
not containing 1 which participate in the last collision in the $n$-coalescent.
Convergence in distribution of properly scaled $M_n$ for $n \to \infty$ was shown already by
Goldschmidt and Martin in [16] and thus the same asymptotic behavior holds
for $X_n$, namely $(\log n)^{-1} \log X_n$ converges in distribution to the uniform
distribution on $[0,1]$.

Note that due to the connection between the random recursive tree and the
standard Chinese restaurant process, we observe that $X_n - 1$ and $M_n$ are
distributed as the size of a uniformly chosen table (not chosen by a size-biased
pick!) in the Chinese restaurant process (again for $M_n$ in accordance with [16]).
This allows us to give several formulae for the exact distribution of $X_n$. Using
these, we show that $\frac{\log n}{n^k} E(X_n^k) \to \frac{1}{k}$ for $n \to \infty$, which gives complementary
information to the weak convergence result.

2 Minimal clade size in the Bolthausen-Sznitman n-coalescent

Set $[n] := \{1, \ldots, n\}$ and $[n]_0 := \{0, \ldots, n\}$. For a partition $\pi$ of $[n]$, let $C_i(\pi)$
denote the equivalence class of $i \in [n]$ and $|C_i(\pi)|$ its size. Let $(\Pi_t^{(n)})_{t \geq 0}$ be a
$\Lambda$-$n$-coalescent. Since we want to look at the minimal clade size of a randomly
chosen allele in the sample whose genealogy is given by $(\Pi_t^{(n)})_{t \geq 0}$, define $I$ as
a uniform pick from $[n]$ independent of the $n$-coalescent. Now, first define the
length of a randomly chosen external branch (associated with the randomly
chosen $I \in [n]$) by

\[ E_n := \inf\{t \geq 0 : C_I(\Pi_t^{(n)}) \neq \{I\}\}. \]

Now we define the size of the minimal clade of the randomly chosen allele $I$ as

\[ X_n := |C_I(\Pi_{E_n}^{(n)})|. \tag{1} \]

Note that, due to exchangeability, we do not change the distributions of $E_n$ and
$X_n$ if we assume $I = 1$. Also note that due to the interpretation of an $n$-coalescent
as a genealogical tree, we refer to $\{1, \ldots, n\}$ as individuals.
From now on, we will refer to the size of the minimal clade of $I = 1$ simply as the
minimal clade size.

The minimal clade size of individual 1 is the size of the equivalence class of 1 at the
first coalescence event that the individual participates in. In [16], Goldschmidt
and Martin have analysed the behavior of the total mass $M_n$ of the equivalence
classes not containing 1 at the last coalescence event in the Bolthausen-Sznitman
$n$-coalescent (see [16, Thm. 3.1]). Note that $M_n$ can also be written
as $n - |C_1(\Pi_{\tau_n-}^{(n)})|$, where $\tau_n$ is the waiting time for the last coalescence event.
Both $X_n$ and $M_n$ are functionals of the equivalence class of 1 in the Bolthausen-Sznitman
$n$-coalescent at different times. Thus, it is of interest how the equivalence
class of 1 changes over time. It will only grow by merging with other
equivalence classes at coalescence times, but not necessarily at all coalescence
times. We define $S_i^{(n)}$ as the equivalence class of 1 after the $i$th merging event
which 1 participates in. What are the properties of $(S_i^{(n)})_{i \in [\kappa_n]}$, where $\kappa_n$ is the
number of merging events 1 participates in? The results from [16] answer this
question. There, the authors show a construction of the Bolthausen-Sznitman
$n$-coalescent by applying a cutting procedure to a random recursive tree and
use it, among other questions, to analyse $M_n$.

We will show in detail that this construction enables us to analyse the behaviour
of $S^{(n)}$ and that it can be expressed in terms of a Chinese restaurant process.
Note that this is just the line of reasoning from [16]. Let us quickly recall the
construction of the Bolthausen-Sznitman $n$-coalescent from a random recursive
tree from [16, Prop. 2.2] as well as the connection to the Chinese restaurant
process. Here, we give a simplified version that only constructs the jump chain of
the $n$-coalescent.

We start with a random recursive tree with $n$ vertices, i.e. a uniformly
distributed random variable on the set of all recursive trees with $n$ vertices $1, \ldots, n$
rooted at 1 (here, the branches carry no length information). Now construct the
jump chain as follows.

1. Choose an edge uniformly at random.

2. Cut the tree at this edge. All labels that are in the subtree not containing
the root are added to the node of the subtree containing the root which
was adjacent to the cut edge.

3. Define a partition by taking the labels at each node of the subtree containing
the root. This partition has the same law as the Bolthausen-Sznitman
$n$-coalescent after the first jump.

4. Repeat steps 1-3 with the subtree containing the root. This
leads to partitions which have the same law as the Bolthausen-Sznitman
$n$-coalescent after the 2nd, 3rd, ... jump.
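
The cutting procedure in steps 1-4 can be sketched in a few lines. This is an illustrative simulation only, not the authors' code: it assumes edges are picked uniformly, it identifies each edge with its endpoint further from the root, and all function names are hypothetical.

```python
import random


def random_recursive_tree(n, rng):
    """Parent map of a uniform random recursive tree on labels 1..n, rooted at 1."""
    # vertex v attaches to a uniformly chosen earlier vertex in 1..v-1
    return {v: rng.randrange(1, v) for v in range(2, n + 1)}


def cut_once(parent, labels, rng):
    """Cut one uniformly chosen edge and move the severed labels to the root side."""
    child = rng.choice(sorted(parent))  # each edge is identified by its lower endpoint
    children = {}
    for v, p in parent.items():
        children.setdefault(p, []).append(v)
    stack, subtree = [child], []
    while stack:  # collect every node in the subtree below the cut edge
        v = stack.pop()
        subtree.append(v)
        stack.extend(children.get(v, []))
    target = parent[child]  # node adjacent to the cut edge on the root side
    for v in subtree:
        labels[target] |= labels.pop(v)
        del parent[v]


def jump_chain(n, seed=None):
    """Partitions of {1..n} visited by the jump chain, from singletons to one block."""
    rng = random.Random(seed)
    parent = random_recursive_tree(n, rng)
    labels = {v: {v} for v in range(1, n + 1)}
    chain = [sorted(sorted(block) for block in labels.values())]
    while parent:  # edges remain as long as there is more than one node
        cut_once(parent, labels, rng)
        chain.append(sorted(sorted(block) for block in labels.values()))
    return chain


def minimal_clade_size(chain, individual=1):
    """Size of the block of `individual` right after its first merger."""
    for partition in chain:
        block = next(b for b in partition if individual in b)
        if len(block) > 1:
            return len(block)
    return None
```

For example, `minimal_clade_size(jump_chain(50, seed=7))` gives one sample of the minimal clade size of individual 1 for $n = 50$ under these assumptions; repeating over many seeds approximates its law.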

We now come to the connection of the random recursive tree with the Chinese
restaurant process.

First, recall that the standard Chinese restaurant process is a sequential
construction of a uniform permutation of $[n]$. Imagine a restaurant with tables
$1, 2, 3, \ldots$, each with infinitely many chairs. The $n$ customers $1, \ldots, n$ sit down at the
tables according to the following rule:

• customer 1 sits at table 1,

• if $i - 1$ customers have taken their seat, the $i$th customer sits with equal
probability at one of the following $i$ places:

- on a chair directly to the left of an already seated customer (possibly
between customers),

- at a previously unoccupied table.

Writing down the customers at each table in seating order, we get the cycles
of a uniform random permutation of $[n]$. If we only record the customers at
each table, but not the seating order, we get an exchangeable partition of $[n]$
whose distribution is given by the Ewens sampling formula. More information on
this process can be found in [23, Ch. 3.1]. We will abbreviate a standard Chinese
restaurant process with $n$ customers by CRP($n$).
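
The seating rule can be simulated directly. The sketch below is illustrative only; the function name and the list-of-lists representation of tables are assumptions of this example. Each table is stored in seating order, so reading the tables as cycles gives a permutation of $[n]$.

```python
import random


def chinese_restaurant(n, seed=None):
    """Simulate CRP(n); return the tables, each as a list of customers in seating order."""
    rng = random.Random(seed)
    tables = []
    for customer in range(1, n + 1):
        # `customer` equally likely places: one to the left of each of the
        # customer-1 seated guests, plus one new table
        spot = rng.randrange(customer)
        if spot == customer - 1:
            tables.append([customer])  # open a previously unoccupied table
            continue
        for table in tables:
            if spot < len(table):
                table.insert(spot, customer)  # sit directly left of an existing guest
                break
            spot -= len(table)
    return tables
```

Picking a table with `random.choice(tables)` is the uniform (not size-biased) choice of a table that appears throughout the text; picking a random customer and taking that customer's table would instead be a size-biased pick.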

A CRP($n - 1$) can be found in a random recursive tree with $n$ vertices in the
following way (see [16, p. 724-725]). We define a subtree of '1' in the random
recursive tree as a rooted subtree whose root is adjacent (connected by one edge)
to the root '1' of the whole tree. Then the subtrees of '1' form an exchangeable
partition of $\{2, \ldots, n\}$ which can be described as a CRP($n - 1$) with customers
labelled $2, \ldots, n$. The following lemma is just a write-up of the line of reasoning
from [16, p. 725] and gives a discrete analogue of a part of [2H Cor. 16] (in [T5],
the line of reasoning presented here is a part of an alternative proof for [321 Cor.
16]).

Lemma 2.1. (practically from Goldschmidt, Martin) Let $\kappa_n$ be the number of
collisions in a Bolthausen-Sznitman $n$-coalescent individual 1 participates in.
For $i \in [\kappa_n]_0$, let $S_i^{(n)}$ be the equivalence class of 1 in the Bolthausen-Sznitman
$n$-coalescent after the $i$th collision. For a CRP($n - 1$) with $K_{n-1}$ tables, let
$RT_1, \ldots, RT_{K_{n-1}}$ be the tables in random order. Then $S^{(n)} = (S_i^{(n)})_{i \in [\kappa_n]_0}$ is
distributed as $\left(\{1\} \cup \bigcup_{j \in [i]} RT_j\right)_{i \in [K_{n-1}]_0}$.

Moreover, the process $S^{(n)} \setminus \{1\} = (S_i^{(n)} \setminus \{1\})_{i \in [\kappa_n]_0}$ giving the relatives of
individual 1 through time is distributed as the time-reversed process
$[n] \setminus S^{(n)} = ([n] \setminus S_{\kappa_n - i}^{(n)})_{i \in [\kappa_n]_0}$ giving the nonrelatives of individual 1.

If the Bolthausen-Sznitman $n$-coalescent is constructed via cutting a random
recursive tree, this lemma can be described more graphically: The equivalence
class of 1 grows by adding tables chosen uniformly at random from the Chinese
restaurant process with $n - 1$ customers given by the subtrees of '1' in the
random recursive tree.

Note that we actually choose tables at random, not individuals sitting at tables,
so we do not make size-biased picks.

Proof. We construct the Bolthausen-Sznitman $n$-coalescent via cutting a random
recursive tree. The equivalence class of 1 is merged with other equivalence
classes as soon as an edge adjacent to the root is cut in the random recursive
tree. The equivalence class of 1 is then merged with the subtree of '1' which
is connected by that edge. Since the edges are chosen at random, this means
that a uniformly chosen table of the CRP($n - 1$) given by the subtrees of '1' is
merged with the class of 1. □

Since $X_n - 1 = |S_1^{(n)} \setminus \{1\}|$ and $M_n = |[n] \setminus S_{\kappa_n - 1}^{(n)}|$, Lemma 2.1 shows that
$X_n - 1$ and $M_n$ have the same distribution, namely that both are distributed
as the size of a uniformly chosen table in a CRP($n - 1$). This means that the
known results for the asymptotics of $M_n$ which are given in [16, Thm. 3.1] are
valid for $X_n - 1$ and, due to a Slutsky argument, are also valid for $X_n$.

Theorem 2.2. Let $n \in \{2, 3, \ldots\}$. Let $X_n$ be the minimal clade size in the
Bolthausen-Sznitman $n$-coalescent. $X_n$ takes values in $\{2, \ldots, n\}$, $X_n - 1$ is
distributed as the size of a randomly chosen table in a CRP($n - 1$), and

\[ \frac{\log X_n}{\log n} \to U_{[0,1]} \]

holds in distribution for $n \to \infty$, where $U_{[0,1]}$ is the uniform distribution on
$[0,1]$.

In addition to this result, we give the complementary information of the exact
law of $X_n$ and of the first order behaviour of all moments of $X_n$ for $n \to \infty$. For
this, we need more knowledge about the distribution of $X_n$.
Theorem 2.2 states that the distribution of $X_n$ can be expressed in terms of
the Chinese restaurant process. We will use this to derive three formulae for
the distribution of $X_n$. Let us recall two ways to look at the distribution
of customers at tables in a CRP($n$). It is well known that this distribution in
a CRP($n$) is given by the celebrated Ewens sampling formula with mutation
parameter $\theta = 1$ (e.g., see [U eq. 1.3]). We use two different ways to look
at the Ewens sampling formula in equations (2) and (3). First, we can record
how many tables in a CRP($n$) have exactly $i$ customers, which we denote by
$A_i^{(n)}$, for each $i \in [n]$. Then for $a_1, \ldots, a_n \in [n]_0$ with $\sum_{i \in [n]} i a_i = n$, we have

\[ P(A_1^{(n)} = a_1, \ldots, A_n^{(n)} = a_n) = \prod_{i=1}^{n} \frac{1}{i^{a_i} a_i!}. \tag{2} \]

On the other hand, we can record the probability that certain sets of customers
sit at tables $1, 2, \ldots$ (this forms a partition $\eta$ of $[n]$). The probability that we
find a certain partition $\eta$ (with blocks ordered by their least element) of $[n]$ with
$k$ occupied tables and $n_i$ customers at the $i$th occupied table is

\[ P(\mathrm{CRP}(n) = \eta) = \frac{1}{n!} \prod_{i \in [k]} (n_i - 1)!. \tag{3} \]

This leads to several ways to express the distribution of $X_n$.

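Lemma 2.3. Let $n \in \{2, 3, \ldots\}$. Let $X_n$ be the minimal clade size in a
Bolthausen-Sznitman $n$-coalescent. For $m \in \mathbb{N}$, let $A_i^{(m)}$ be the number of tables
with exactly $i$ customers in a CRP($m$) and $K_m = \sum_{i \in [m]} A_i^{(m)}$ the number of
occupied tables. Define $K_0 := 0$ ('empty restaurant'). Then for $j \in [n - 1]$:

a) Denoting $T_n = \{a_1, \ldots, a_{n-1} \in [n-1]_0 : \sum_{i=1}^{n-1} i a_i = n - 1\}$,

\[ P(X_n = j + 1) = E\left(\frac{A_j^{(n-1)}}{K_{n-1}}\right)
   = \sum_{(a_1, \ldots, a_{n-1}) \in T_n} \frac{a_j}{\sum_{i=1}^{n-1} a_i} \prod_{i=1}^{n-1} \frac{1}{i^{a_i} a_i!}. \]

b) Denoting $A(n, k) = \{n_1, \ldots, n_k \in [n] : \sum_{i=1}^{k} n_i = n\}$ for $k \leq n$,

\[ P(X_n = j + 1) = \frac{1}{j} \sum_{k=1}^{n-1-j} \frac{1}{(k+1)!} \sum_{A(n-1-j, k)} \frac{1}{n_1 \cdots n_k} \]

for $j < n - 1$ and $P(X_n = n) = \frac{1}{n-1}$.

c) Let $B_1, B_2, \ldots$ be independent Bernoulli-distributed random variables with
success probability $\frac{1}{i}$ for $B_i$. Then

\[ P(X_n = j + 1) = \frac{1}{j} E\left(\frac{1}{1 + K_{n-1-j}}\right)
   = \frac{1}{j} E\left(\frac{1}{1 + \sum_{i=1}^{n-1-j} B_i}\right). \]

Note that the above lemma also holds true for $M_n$ and the size $RT_{n-1}$ of a randomly
chosen table in a CRP($n - 1$) (just replace $j + 1$ with $j$). Also note that this result
provides a very rare example where an exact law is obtained for a functional of
an exchangeable non-Kingman, non-star-shaped $n$-coalescent.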
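
As a sanity check on part a) of the lemma, one can compare the Ewens-formula sum with a brute-force average over all permutations of $n - 1$ elements, using the fact that the tables of a CRP($n-1$) are the cycles of a uniform permutation. The sketch below is an illustration for small $n$ only; the function names are hypothetical and exact rational arithmetic is used throughout.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations
from math import factorial


def integer_partitions(total, largest=None):
    """Yield the integer partitions of `total` as non-increasing tuples of parts."""
    if largest is None:
        largest = total
    if total == 0:
        yield ()
        return
    for part in range(min(total, largest), 0, -1):
        for rest in integer_partitions(total - part, part):
            yield (part,) + rest


def clade_law_a(n):
    """P(X_n = j + 1) via the Ewens-formula sum (theta = 1), as {size: probability}."""
    law = {j + 1: Fraction(0) for j in range(1, n)}
    for parts in integer_partitions(n - 1):
        counts = Counter(parts)  # counts[i] = a_i = number of tables of size i
        weight = Fraction(1)
        for size, a in counts.items():
            weight /= size ** a * factorial(a)  # Ewens weight 1 / (i^a_i * a_i!)
        for size, a in counts.items():
            law[size + 1] += weight * Fraction(a, len(parts))  # uniform table pick
    return law


def cycle_lengths(perm):
    """Cycle lengths of a permutation given as a tuple with perm[i] the image of i."""
    seen, lengths = set(), []
    for start in range(len(perm)):
        length, v = 0, start
        while v not in seen:
            seen.add(v)
            v = perm[v]
            length += 1
        if length:
            lengths.append(length)
    return lengths


def clade_law_bruteforce(n):
    """Same law, by averaging a uniform cycle pick over all (n-1)! permutations."""
    m = n - 1
    law = {j + 1: Fraction(0) for j in range(1, n)}
    for perm in permutations(range(m)):
        lengths = cycle_lengths(perm)
        for length in lengths:
            law[length + 1] += Fraction(1, len(lengths) * factorial(m))
    return law
```

The brute-force version is only feasible for very small $n$, but it depends on nothing except the definition of a uniformly chosen table, which makes it a useful independent check.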

Proof. Due to Theorem 2.2 we know that $X_n - 1$ is distributed as the
size of a randomly chosen table in a CRP($n - 1$). Given the table counts
$A_1^{(n-1)}, \ldots, A_{n-1}^{(n-1)}$, the probability that we randomly choose a table with $j$
customers is $\frac{A_j^{(n-1)}}{\sum_{i \in [n-1]} A_i^{(n-1)}} = \frac{A_j^{(n-1)}}{K_{n-1}}$. Summing over the distribution of the table
counts given by (2) gives a).

Now look at the partition $\eta$ of $[n-1]$ constructed via a CRP($n - 1$) whose
distribution is given by (3). We are interested in the partition not in order of
least elements, but in exchangeable order (meaning that if the partition has $k$
blocks, we order them randomly). Let $N_1^{(n-1)}, \ldots, N_k^{(n-1)}$ be the table sizes in
exchangeable order. By combinatorial arguments (see [23, (2.7)]), we get

\[ P(N_1^{(n-1)} = n_1, \ldots, N_k^{(n-1)} = n_k)
   = \binom{n-1}{n_1, \ldots, n_k} \frac{1}{k!}
     \underbrace{\frac{1}{(n-1)!} \prod_{i \in [k]} (n_i - 1)!}_{\text{prob. of } \eta \text{, least elements}}
   = \frac{1}{k!} \prod_{i \in [k]} \frac{1}{n_i}. \tag{4} \]

The size of a randomly picked table in the CRP is distributed as $N_1^{(n-1)}$. This
is just the marginal distribution from the above formula, namely

\[ P(X_n = j + 1) = P(N_1^{(n-1)} = j)
   = \sum_{k=1}^{n-j} \frac{1}{k!} \sum_{j + \sum_{i=2}^{k} n_i = n-1} \frac{1}{j \, n_2 \cdots n_k}. \tag{5} \]

If $j = n - 1$, (5) equals $\frac{1}{n-1}$. For $1 \leq j < n - 1$, we have

\[ P(X_n = j + 1)
   = \frac{1}{j} \sum_{k=2}^{n-j} \frac{1}{k!} \sum_{n_2, \ldots, n_k \in [n-1-j],\ \sum_{i=2}^{k} n_i = n-1-j} \frac{1}{n_2 \cdots n_k}
   = \frac{1}{j} \sum_{k=1}^{n-1-j} \frac{1}{(k+1)!} \sum_{A(n-1-j, k)} \frac{1}{n_1 \cdots n_k}, \]

where the last equation is due to an index shift. This shows b).

To show c), we compare (5) with $E((1 + K_{n-1-j})^{-1})$. First note that for $j = n - 1$,
we have $K_0 = 0$ and thus

\[ \frac{1}{n-1} E\left(\frac{1}{K_0 + 1}\right) = \frac{1}{n-1}, \]

which matches the expression in b). Now assume $1 \leq j < n - 1$. If we look at the
table sizes in exchangeable order, we can compute $P(K_{n-1-j} = k)$ by summing
up the probabilities of all possible configurations of table sizes of exactly $k$
occupied tables in a CRP($n - 1 - j$). Using (4), this leads to

\[ E\left(\frac{1}{1 + K_{n-1-j}}\right)
   = \sum_{k=1}^{n-1-j} \frac{1}{k+1} P(K_{n-1-j} = k)
   = \sum_{k=1}^{n-1-j} \frac{1}{(k+1)!} \sum_{A(n-1-j, k)} \frac{1}{n_1 \cdots n_k}. \]

Comparison with (5) yields

\[ P(X_n = j + 1) = \frac{1}{j} E\left(\frac{1}{1 + K_{n-1-j}}\right). \]

Recall that $K_{n-1-j}$ is distributed as the number of cycles in a uniform permutation
of $[n - 1 - j]$. It is well-known that the number of cycles is distributed as the
sum of independent Bernoulli variables $B_1, \ldots, B_{n-1-j}$ with success probability
$\frac{1}{i}$ for $B_i$ (e.g., see [TJ p. 10]). This proves c). □
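
The proof uses three descriptions of the law of the number of occupied tables $K_m$: the sum over table-size configurations coming from (4), the cycle count of a uniform permutation (unsigned Stirling numbers of the first kind divided by $m!$), and the sum of independent Bernoulli variables. The following sketch, with hypothetical function names, checks for small $m$ that all three give the same probabilities; it is an illustration, not part of the proof.

```python
from fractions import Fraction
from math import factorial


def cycle_law_stirling(m):
    """[P(K_m = k) for k = 0..m] from unsigned Stirling numbers of the first kind."""
    row = [1]  # S_{0,0} = 1
    for size in range(1, m + 1):
        # recurrence S_{size,k} = S_{size-1,k-1} + (size-1) * S_{size-1,k}
        row = [
            (row[k - 1] if k >= 1 else 0) + (size - 1) * (row[k] if k < len(row) else 0)
            for k in range(size + 1)
        ]
    return [Fraction(s, factorial(m)) for s in row]


def composition_sum(m, k):
    """Sum of 1/(n_1 ... n_k) over ordered (n_1, ..., n_k) of positive integers with total m."""
    table = [Fraction(1)] + [Fraction(0)] * m  # zero parts: only the empty sum 0
    for _ in range(k):
        table = [
            sum((table[t - first] / first for first in range(1, t + 1)), Fraction(0))
            for t in range(m + 1)
        ]
    return table[m]


def cycle_law_compositions(m):
    """[P(K_m = k)] as (1/k!) times the sum over table sizes in exchangeable order."""
    return [composition_sum(m, k) / factorial(k) for k in range(m + 1)]


def cycle_law_bernoulli(m):
    """[P(K_m = k)] for K_m = B_1 + ... + B_m with P(B_i = 1) = 1/i."""
    law = [Fraction(1)]  # K_0 = 0
    for i in range(1, m + 1):
        p = Fraction(1, i)
        law = [
            (law[k] * (1 - p) if k < len(law) else 0) + (law[k - 1] * p if k >= 1 else 0)
            for k in range(len(law) + 1)
        ]
    return law
```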

Remark. Let $K_n$ be the number of occupied tables in a CRP($n$). Using
$K_n = \sum_{i \in [n]} B_i$ for independent Bernoulli variables with success probability $\frac{1}{i}$, we
deduce the recursion

\[ E\left(\frac{1}{m + K_n}\right)
   = \left(1 - \frac{1}{n}\right) E\left(\frac{1}{m + K_{n-1}}\right)
   + \frac{1}{n} E\left(\frac{1}{m + 1 + K_{n-1}}\right) \]

for all $m \in \mathbb{N}_0$. This recursion gives an efficient method to compute the
distribution of the minimal clade size $X_n$ by using the representation in Lemma 2.3
c).
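
A minimal sketch of how this recursion might be turned into a computation is given below. It conditions on $B_n$, starts from $K_0 = 0$ (so that $E(1/(m + K_0)) = 1/m$ for $m \geq 1$), and then applies part c) of Lemma 2.3. The function names are hypothetical and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def inverse_moments(max_l):
    """Return [E(1/(1 + K_l)) for l = 0..max_l] via the recursion, with K_0 = 0."""
    # row[m-1] holds E(1/(m + K_l)) for the current l; for l = 0 this is 1/m
    row = [Fraction(1, m) for m in range(1, max_l + 2)]
    result = [row[0]]
    for l in range(1, max_l + 1):
        p = Fraction(1, l)  # success probability of B_l
        # E(1/(m+K_l)) = (1-p) E(1/(m+K_{l-1})) + p E(1/(m+1+K_{l-1}))
        row = [(1 - p) * row[i] + p * row[i + 1] for i in range(len(row) - 1)]
        result.append(row[0])
    return result


def clade_law(n):
    """P(X_n = j + 1) = (1/j) E(1/(1 + K_{n-1-j})) for j = 1..n-1, as {size: prob}."""
    inv = inverse_moments(n - 2)
    return {j + 1: Fraction(1, j) * inv[n - 1 - j] for j in range(1, n)}
```

With this layout the whole law of $X_n$ costs on the order of $n^2$ arithmetic operations, since each step of the recursion shortens the working row by one entry.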

Remark. In [16], Goldschmidt and Martin have proven the weak convergence
result for $M_n$ for $n \to \infty$ by using the construction of the Bolthausen-Sznitman
$n$-coalescent via cutting a random recursive tree and embedding the random
recursive tree in a Yule process. However, as also hinted at by Goldschmidt and
Martin (see [16, Cor. 3.3, Remark a)]), the representation of $M_n$ as a uniformly
chosen table in a CRP($n - 1$) allows one to use results about uniform random
permutations to prove the convergence part of Theorem 2.2 without using the Yule process
embedding.



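Proof. (Alternative proof of Theorem 2.2)

First, let us look at the distribution function of $\frac{\log(X_n - 1)}{\log(n-1)}$. Let $x \in [0,1]$. Using
Lemma 2.3 a), we get

\[ P\left(\frac{\log(X_n - 1)}{\log(n-1)} \leq x\right) = P\left(X_n - 1 \leq (n-1)^x\right)
   = E\left(\frac{\sum_{j=1}^{\lfloor (n-1)^x \rfloor} A_j^{(n-1)}}{\sum_{j=1}^{n-1} A_j^{(n-1)}}\right), \tag{6} \]

where $A_i^{(n-1)}$ is the number of tables with exactly $i$ customers in a CRP($n - 1$).

The functional central limit theorem of DeLaurentis and Pittel [IT] (see also
Hansen [20]) states

\[ \left(\frac{\sum_{j=1}^{\lfloor n^x \rfloor} A_j^{(n)} - x \log n}{\sqrt{\log n}}\right)_{x \in [0,1]} \to (B_x)_{x \in [0,1]} \]

in $D[0,1]$ when $n \to \infty$, where $B$ is a standard Brownian motion. This implies

\[ \frac{\sum_{j=1}^{\lfloor n^x \rfloor} A_j^{(n)}}{\log n} \to x \]

in probability for $x \in [0,1]$. We apply this result to both the numerator and denominator of
the right-hand side of (6) (inside of $E(\cdot)$) and get

\[ \frac{\frac{1}{\log(n-1)} \sum_{j=1}^{\lfloor (n-1)^x \rfloor} A_j^{(n-1)}}{\frac{1}{\log(n-1)} \sum_{j=1}^{n-1} A_j^{(n-1)}} \to x \]

for $n \to \infty$. Since

\[ 0 \leq \frac{\sum_{j=1}^{\lfloor (n-1)^x \rfloor} A_j^{(n-1)}}{\sum_{j=1}^{n-1} A_j^{(n-1)}} \leq 1 \]

for all $x, n$, we have uniform integrability and hence

\[ E\left(\frac{\sum_{j=1}^{\lfloor (n-1)^x \rfloor} A_j^{(n-1)}}{\sum_{j=1}^{n-1} A_j^{(n-1)}}\right) \to x \]

for $n \to \infty$, which shows

\[ \frac{\log(X_n - 1)}{\log(n - 1)} \to U_{[0,1]} \]

in distribution, where $U_{[0,1]}$ is the uniform distribution on $[0,1]$. $\frac{\log X_n}{\log n}$ behaves in the same
way, which can be shown with a Slutsky argument. □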

For the asymptotics of the moments of $X_n$ (as well as $M_n$ and $RT_n$), we use the
expression for the law of $X_n$ from Lemma 2.3 c), namely

\[ P(X_n = j + 1) = \frac{1}{j} E\left(\frac{1}{1 + K_{n-1-j}}\right), \]

where $K_n$ is the number of occupied tables in a CRP($n$). Note
that $K_n$ also gives the number of cycles in a uniform permutation of $\{1, \ldots, n\}$.
Thus, the distribution of $K_n$ is given by

\[ P(K_n = k) = \frac{S_{n,k}}{n!} \quad \text{for } k \in [n], \tag{7} \]

where $(S_{n,k})_{k \in [n], n \in \mathbb{N}}$ denote the absolute Stirling numbers of the first kind.
It is well-known that (see, e.g., [23, Eq. 3.2])

\[ \frac{K_n}{\log n} \to 1 \quad \text{almost surely} \tag{8} \]

for $n \to \infty$. Since we want to use Lemma 2.3 c), we are more interested in the
behaviour of $E((1 + K_n)^{-1})$. From (8), we immediately get

\[ \frac{\log n}{1 + K_n} \to 1 \quad \text{almost surely} \tag{9} \]

for $n \to \infty$. We will need an $L^1$-version of (9).
Lemma 2.4.

\[ \frac{\log n}{1 + K_n} \to 1 \quad \text{in } L^1 \text{ for } n \to \infty. \]

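Proof. The result follows from (9) and the uniform integrability of $\frac{\log n}{1 + K_n}$,
which we show now. Note that since $\frac{\log n}{1 + K_n} \leq \frac{\log n}{K_n}$ for all $n \in \mathbb{N}$, it suffices
to show uniform integrability for $\frac{\log n}{K_n}$. Let $A > 0$ and let $H_n \sim \mathrm{Pn}(\log n)$
be a Poisson-distributed random variable with parameter $\log n$. Note that
$H_n = \sum_{i \in [\log n]} H_i^{(1)}$, where $(H_i^{(1)})_{i \in \mathbb{N}}$ are i.i.d. with $H_i^{(1)} \sim \mathrm{Pn}(1)$. For $A > 1$,
we have

\[ \int_{\left\{\frac{\log n}{K_n} > A\right\}} \frac{\log n}{K_n} \, dP
   = \log n \sum_{k=1}^{A^{-1} \log n} \frac{1}{k} P(K_n = k)
   = \log n \sum_{k=1}^{A^{-1} \log n} \frac{S_{n,k}}{k \, n!} \]
\[ = \log n \sum_{k=1}^{A^{-1} \log n} \frac{(\log n)^{k-1}}{k! \, n} \left(\frac{1}{\Gamma(1 + r)} + O\left(\frac{k}{(\log n)^2}\right)\right)
   \leq C \, P\left(H_n \leq \frac{\log n}{A}\right)
   = C \, P\left(\frac{\sum_{i \in [\log n]} H_i^{(1)}}{\log n} \leq \frac{1}{A}\right) \to 0 \]

for $n \to \infty$, where $r = (k - 1)(\log n)^{-1}$ and $C$ is a suitable constant. Here, we
use the uniform asymptotic expansion from Hwang (see Theorem 2 in [18]) for
the absolute Stirling numbers $S_{n,k}$ of the first kind for $1 \leq k \leq A^{-1} \log n$ (we
actually use the cruder version from p] Eq. 1.30]). The convergence to 0 follows
from the law of large numbers for $(H_i^{(1)})_{i \in \mathbb{N}}$.

This computation shows the uniform integrability of $\frac{\log n}{K_n}$ and thus the lemma.

□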

Theorem 2.5. For $n \in \{2, 3, \ldots\}$, let $X_n$ be the minimal clade size in the
Bolthausen-Sznitman $n$-coalescent. For all $k \in \mathbb{N}$, we have

\[ \frac{\log n}{n^k} E(X_n^k) \to \frac{1}{k} \]

for $n \to \infty$.

Again, this theorem is also true for $M_n$ and $RT_n$ instead of $X_n$.
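Proof. Using Lemma 2.3 c), we get

\[ E((X_n - 1)^k) = \sum_{j=1}^{n-1} j^{k-1} E\left(\frac{1}{1 + K_{n-1-j}}\right)
   = \sum_{l=0}^{n-2} (n - 1 - l)^{k-1} E\left(\frac{1}{1 + K_l}\right) \]
\[ = \sum_{i=0}^{k-1} \binom{k-1}{i} (n-1)^{k-1-i} (-1)^i \sum_{l=0}^{n-2} l^i E\left(\frac{1}{1 + K_l}\right). \]

We will now use Karamata's Tauberian theorem for power series (see [3, Cor.
1.7.3]). It states (among other things) that if $a_l \sim \frac{c}{\Gamma(\rho)} l^{\rho - 1} \ell(l)$ for $l \to \infty$, where
$c, \rho > 0$ and $\ell$ is a slowly varying function, then $\sum_{k \in [n]} a_k \sim \frac{c}{\Gamma(1 + \rho)} n^{\rho} \ell(n)$. We
define $a_l := l^i E\left(\frac{1}{1 + K_l}\right)$. Note that $a_l \sim \frac{l^i}{\log l}$ for $l \to \infty$ due to Lemma 2.4,
which enables us to use the Tauberian theorem for $a_l$ with $c := \Gamma(i + 1) = i!$,
$\rho = i + 1$ and $\ell(n) = (\log n)^{-1}$. For each $i \in [k - 1]_0$, we thus have

\[ \sum_{l=0}^{n-2} l^i E\left(\frac{1}{1 + K_l}\right) \sim \frac{1}{i + 1} \frac{n^{i+1}}{\log n} \]

for $n \to \infty$. This shows

\[ \sum_{i=0}^{k-1} \binom{k-1}{i} (n-1)^{k-1-i} (-1)^i \sum_{l=0}^{n-2} l^i E\left(\frac{1}{1 + K_l}\right)
   \sim \sum_{i=0}^{k-1} \binom{k-1}{i} n^{k-1-i} (-1)^i \frac{1}{i + 1} \frac{n^{i+1}}{\log n} \]
\[ = \frac{n^k}{\log n} \sum_{i=0}^{k-1} \binom{k-1}{i} \frac{(-1)^i}{i + 1} = \frac{n^k}{\log n} \cdot \frac{1}{k} \]

for $n \to \infty$, where the last equation follows by elementary calculations. Thus,
for each $k \in \mathbb{N}$, we have established

\[ \frac{\log n}{n^k} E((X_n - 1)^k) \to \frac{1}{k} \]

for $n \to \infty$. The theorem is now proven as

\[ \frac{\log n}{n^k} E(X_n^k) = \sum_{i \in [k]_0} \binom{k}{i} \frac{\log n}{n^k} E((X_n - 1)^i) \to \frac{1}{k} \]

for $n \to \infty$. □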
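
The "elementary calculations" invoked in the proof above amount to the following identity. One possible way to verify it, written out here as a worked step that is not taken from the source, is to express $\frac{1}{i+1}$ as an integral and apply the binomial theorem:

```latex
\sum_{i=0}^{k-1} \binom{k-1}{i} \frac{(-1)^i}{i+1}
  = \sum_{i=0}^{k-1} \binom{k-1}{i} (-1)^i \int_0^1 x^i \, dx
  = \int_0^1 (1-x)^{k-1} \, dx
  = \frac{1}{k}.
```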
Remark. This last result fits well with the notion that $X_n$ can heuristically
be seen as $n^U$ when $n$ is big ($U$ uniformly distributed on $[0,1]$), following from
Theorem 2.2, since the $k$th moment of $n^U$ is $\frac{n^k - 1}{k \log n}$.
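
The moment of $n^U$ quoted in this remark follows from a one-line integral, spelled out here for convenience (a worked step added for illustration, with $U$ uniform on $[0,1]$ as in the remark):

```latex
E\left(n^{kU}\right) = \int_0^1 e^{k u \log n} \, du
  = \frac{n^k - 1}{k \log n},
\qquad\text{so}\qquad
\frac{\log n}{n^k} \, E\left(n^{kU}\right) = \frac{1 - n^{-k}}{k} \to \frac{1}{k}
\quad (n \to \infty).
```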

We compare this heuristic to the results for $X_n$ in Kingman's $n$-coalescent from
jH p. 4], where the authors state that $X_n$, without scaling, converges to a Yule
distribution of parameter $\rho = 2$. So in the Bolthausen-Sznitman $n$-coalescent,
the minimal clade size is much bigger than in Kingman's coalescent. This agrees
with the more star-like shape of a non-Kingman $n$-coalescent compared to Kingman's
$n$-coalescent.